About Topic model



Topic model: English Wikipedia

Topic model
In machine learning and natural language processing, a topic model is a type of statistical model for discovering the abstract "topics" that occur in a collection of documents. Intuitively, given that a document is about a particular topic, one would expect particular words to appear in the document more or less frequently: "dog" and "bone" will appear more often in documents about dogs, "cat" and "meow" will appear in documents about cats, and "the" and "is" will appear equally in both. A document typically concerns multiple topics in different proportions; thus, in a document that is 10% about cats and 90% about dogs, there would probably be about 9 times more dog words than cat words. A topic model captures this intuition in a mathematical framework, which allows examining a set of documents and discovering, based on the statistics of the words in each, what the topics might be and what each document's balance of topics is.
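The 10% cats / 90% dogs intuition can be checked with a small calculation. The following is a minimal sketch, not a full topic model: it assumes the topics and their word probabilities are already known, whereas a real topic model has to infer them from the documents. The two topics, their probability values, and the helper name `expected_word_counts` are all hypothetical and chosen only for illustration.

```python
# Hypothetical per-topic word distributions (each sums to 1).
# The values are illustrative only, not taken from any real corpus.
TOPICS = {
    "cats": {"cat": 0.3, "meow": 0.2, "the": 0.3, "is": 0.2},
    "dogs": {"dog": 0.3, "bone": 0.2, "the": 0.3, "is": 0.2},
}


def expected_word_counts(topic_mix, doc_length):
    """Expected count of each word in a document with the given topic proportions."""
    counts = {}
    for topic, proportion in topic_mix.items():
        for word, prob in TOPICS[topic].items():
            # Each topic contributes in proportion to its share of the document.
            counts[word] = counts.get(word, 0.0) + doc_length * proportion * prob
    return counts


# A 1000-word document that is 10% about cats and 90% about dogs.
counts = expected_word_counts({"cats": 0.1, "dogs": 0.9}, doc_length=1000)
cat_words = counts["cat"] + counts["meow"]
dog_words = counts["dog"] + counts["bone"]
print(cat_words, dog_words, dog_words / cat_words)
```

Under these assumed distributions, the expected number of dog words is about 9 times the expected number of cat words, while function words such as "the" and "is" occur at the same rate regardless of the topic mix.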
Although topic models were first described and implemented in the context of natural language processing, they have applications in other fields such as bioinformatics.
History
An early topic model was described by Papadimitriou, Raghavan, Tamaki and Vempala in 1998.
Another one, called probabilistic latent semantic indexing (PLSI), was created by Thomas Hofmann in 1999. Latent Dirichlet allocation (LDA), perhaps the most common topic model currently in use, is a generalization of PLSI developed by David Blei, Andrew Ng, and Michael I. Jordan in 2002, allowing documents to have a mixture of topics. Other topic models are generally extensions of LDA, such as Pachinko allocation, which improves on LDA by modeling correlations between topics in addition to the word correlations that constitute topics.

Excerpt source: Wikipedia, the free encyclopedia